Skip to content

Proposing ZendMM Observer API #11758

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Open
wants to merge 26 commits into
base: master
Choose a base branch
from

Conversation

realFlowControl
Copy link
Contributor

@realFlowControl realFlowControl commented Jul 21, 2023

Hello everyone,

this PR proposes a ZendMM Observer API which allows for observing memory allocations in the Zend Memory Manager. The only current way to observer the ZendMM is via the zend_mm_set_custom_handlers() function which is meant to bring your own memory manager. Using this API has some side effects which you need to work around:

  • ZendMM won't cleanup allocated memory anymore in shutdown
  • ZendMM won't cleanup allocated memory anymore in case the memory limit was reached
  • ZendMM’s zend_mm_gc() function won't do anything

Additionally you need to be aware of neighbouring extensions and handle everything in order.

This proposal flips this by keeping track of observers and not interfering with ZendMM internals. It will only be an observer that calls observing functions on malloc, free and realloc and ZendMM garbage collection (gc_mem_caches() user land call or ZendMM internal cleanup when allocating but no free slots available).
It has no overhead as long as no observer is installed or when all observers that are installed are disabled.

An example of how to use the API can be found at https://github.com/realFlowControl/zendmmobserver/blob/main/zendmmobserver.c

Big picture:

  • an extension registers its observers via zend_mm_observers_register()
  • after the module startup the ZendMM observers will be enabled (from this point on the observing functions will be called if any)
  • during runtime an extension may call zend_mm_observer_disable() / zend_mm_observer_enable() to disable / enable a registered observer
  • before module shutdown the ZendMM observers will be shutdown and unregistered

UPDATE
We made it less complex and it looks like this:

  • zend_mm_observers_register() can be called in and after RINIT
  • you may unregister using zend_mm_observers_unregister() in or before RSHUTDOWN
  • in case you are not unregistering, observers will be unregistered in zend_mm_shutdown() for you after every request
  • this also allows to install and uninstall an observer during a request, maybe via exposing it to userland

@bwoebi bwoebi requested review from iluuu1994 and removed request for bukka July 21, 2023 13:42
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Perhaps the ZEND_API parts of this file should be moved to Zend/observer.h?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hmm, there is already a Zend/zend_observer.h file, but this is for the Zend Engine. I think this could lead to misunderstandings. Personally I am totally okay with having this in Zend/zend_alloc.h as everything regarding the ZendMM is there. We might consider moving it to a Zend/zend_alloc_observer.h file to keep the naming clear.
Personally I would leave it as is, but I do not have any hard feelings the one way or the other.

Copy link
Member

@arnaud-lb arnaud-lb left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you for this!

Is there a particular reason for requiring that observers are registered during startup/MINIT? Assuming a memory profiling extension, registering observers during MINIT adds unnecessary overhead when not profiling.

@bwoebi
Copy link
Member

bwoebi commented Jul 21, 2023

Assuming a memory profiling extension, registering observers during MINIT adds unnecessary overhead when not profiling.

@arnaud-lb The overhead would be only once per ginit, with a relatively low overhead. I think that's pretty reasonable.

The alternative would be that memory profiling extensions have to register in each rinit, and PHP would have to free these in rshutdowns (or at least check for it, even if nothing happens). Not sure whether doing that would really have less overhead.

@arnaud-lb
Copy link
Member

@arnaud-lb The overhead would be only once per ginit, with a relatively low overhead. I think that's pretty reasonable.

The alternative would be that memory profiling extensions have to register in each rinit, and PHP would have to free these in rshutdowns (or at least check for it, even if nothing happens). Not sure whether doing that would really have less overhead.

I was referring to overhead in the allocation functions, since they take the "custom heap" path when some observers are enabled. But I was missing that extensions would disable the observers when they are not needed. As long as they do, this will have no overhead, so that's fine.

@iluuu1994
Copy link
Member

iluuu1994 commented Jul 24, 2023

Is there a particular reason for requiring that observers are registered during startup/MINIT? Assuming a memory profiling extension, registering observers during MINIT adds unnecessary overhead when not profiling.

Probably because zend_mm_observers is a true global, and thus we'd need some synchronization locking for adding new observers. E.g.

  1. Allocate new zend_mm_heap_observer
  2. Fetch current zend_mm_observers global
  3. Assign current zend_mm_observers to zend_mm_heap_observer.next
  4. Assign new zend_mm_heap_observer to zend_mm_observers global

We can have a race condition between steps 2 and 4, losing the value from the thread that finished first. Furthermore, we'd need to move this to a shared global for NTS. If we could register observers dynamically we would likely also need to offer API to unregister them.

As you mentioned, an extension that registers an observer could disable it by default unless profiling was requested.

Edit: Oh, they are appended. But the issue remains.

@arnaud-lb
Copy link
Member

arnaud-lb commented Jul 24, 2023

Is there a particular reason for requiring that observers are registered during startup/MINIT? Assuming a memory profiling extension, registering observers during MINIT adds unnecessary overhead when not profiling.

Probably because zend_mm_observers is a true global, and thus we'd need some synchronization locking for adding new observers. E.g.

I got that we can not mutate zend_mm_observers due to concurrency, but does zend_mm_observers need to be a true global? This avoids allocations because observers are registered once, but this adds complexity:

  • zend_mm_observers need to be copied to thread-locals because they are mutated by the enable/disable APIs, and because we need the enabled/disabled state to be thread-local.
  • The lifetime of the copy spans multiple requests, so the enabled/disabled state persists across requests. This requires the extension to take care of checking or updating the state of its observers at the beginning of each request.

Couldn't we obtain the same functionality if zend_mm_observers was thread-local, and observers was registered just for one request? This requires one malloc/free per observer (are extension expected to register more than one?), but this may not be significant compared to the overhead of calling the observer (even with a no-op observer).

@iluuu1994
Copy link
Member

@arnaud-lb Of course, in my comment I missed that observers can be enabled/disabled per request and so can't be shared across processes/threads. Your suggestion to enable per request sounds simpler. The overhead of the allocation per request is negligible, and only relevant when memory profiling. Memory profiling leaking across requests doesn't seem particularly useful.

@realFlowControl
Copy link
Contributor Author

realFlowControl commented Jul 25, 2023

Hey @arnaud-lb and @iluuu1994,
thanks for your feedback 👍 (and sorry for not being very responsive, I was sick yesterday)
You are saying that we could make this easier by calling zend_mm_observers_register() in RINIT (and store observers directly on the AG(mm_heap)->observers list) and a zend_mm_observers_unregister() in RSHUTDOWN? (we may also omit the unregister and make this internally in the zend_mm_shutdown() when full == false)
This would solve some of the problems you mentioned, especially the syncing from global to thread local. Having the observers stay active across requests was intentional as we'd save an allocation, the corresponding free and traversing the linked list per request. I do understand that you say this is negligible (I personally was not sure about this), so I'll change the implementation to this:

  • zend_mm_observers_register() in RINIT (passing malloc, free, and realloc function pointers)
    • register directly to the AG(mm_heap)->observers
    • set AG(mm_heap)->use_custom_heap flag
  • unregister in zend_mm_shutdown() when full == false
    • free observers list
    • unset AG(mm_heap)->use_custom_heap flag

I'll play with this and check the usage, but I think it makes sense this way especially when the malloc() and the free() are negligible.
About the unregister: I'll play with this and see. When we say an extension needs to call the unregister, we should make sure to unregister internally before extension are dlclose()ed, as there are some free's happening after dlclose() and an extension could just not unregister and will crash. Unregistering in zend_mm_shutdown() would add a NULL check on the AG(mm_heap)->observers for everyone, probably also negligible?

What do you think?

@realFlowControl
Copy link
Contributor Author

realFlowControl commented Jul 25, 2023

One thing in favour of having a unregister function is that this would allow an extension to expose observing memory allocations to userspace (add a memprof_enable() and memprof_disable() function to PHP) and enable / disable at will. We may not use it this way, but it would be certainly possible!
It currently is possible using the zend_mm_set_custom_handlers() API.

@realFlowControl realFlowControl force-pushed the zendmm-observer branch 3 times, most recently from a893bbf to c0d92b2 Compare July 25, 2023 10:54
@arnaud-lb
Copy link
Member

Hey @arnaud-lb and @iluuu1994, thanks for your feedback +1 (and sorry for not being very responsive, I was sick yesterday)

No worries, and I hope you are feeling better now!

You are saying that we could make this easier by calling zend_mm_observers_register() in RINIT (and store observers directly on the AG(mm_heap)->observers list) and a zend_mm_observers_unregister() in RSHUTDOWN? (we may also omit the unregister and make this internally in the zend_mm_shutdown() when full == false) This would solve some of the problems you mentioned, especially the syncing from global to thread local. Having the observers stay active across requests was intentional as we'd save an allocation, the corresponding free and traversing the linked list per request. I do understand that you say this is negligible (I personally was not sure about this), so I'll change the implementation to this:

* `zend_mm_observers_register()` in RINIT (passing malloc, free, and realloc function pointers)
  
  * register directly to the `AG(mm_heap)->observers`
  * set `AG(mm_heap)->use_custom_heap` flag

* unregister in `zend_mm_shutdown()` when `full == false`
  
  * free observers list
  * unset `AG(mm_heap)->use_custom_heap` flag

I'll play with this and check the usage, but I think it makes sense this way especially when the malloc() and the free() are negligible. About the unregister: I'll play with this and see. When we say an extension needs to call the unregister, we should make sure to unregister internally before extension are dlclose()ed, as there are some free's happening after dlclose() and an extension could just not unregister and will crash. Unregistering in zend_mm_shutdown() would add a NULL check on the AG(mm_heap)->observers for everyone, probably also negligible?

What do you think?

Thank you for these changes. This is simpler and it's probably easier to use the API in the right way as well.

Agreed that we should unregister observers in mm_shutdown (including when full=false), as this would also prevent accidentally leaking observers to next requests.

I will try the API on memprof later this week (probably on Friday) to check that it's suitable for more than one extension.

Copy link
Member

@iluuu1994 iluuu1994 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This looks much simpler, thank you @realFlowControl!

@iluuu1994
Copy link
Member

I do think we should have some simple tests in ext/zend_test.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

8 participants